
    Introductory programming: a systematic literature review

    As computing becomes a mainstream discipline embedded in the school curriculum and acts as an enabler for an increasing range of academic disciplines in higher education, the literature on introductory programming is growing. Although there have been several reviews that focus on specific aspects of introductory programming, there has been no broad overview of the literature exploring recent trends across the breadth of introductory programming. This paper is the report of an ITiCSE working group that conducted a systematic review in order to gain an overview of the introductory programming literature. Partitioning the literature into papers addressing the student, teaching, the curriculum, and assessment, we explore trends, highlight advances in knowledge over the past 15 years, and indicate possible directions for future research

    Mining clinical relationships from patient narratives

    Background: The Clinical E-Science Framework (CLEF) project has built a system to extract clinically significant information from the textual component of medical records in order to support clinical research, evidence-based healthcare and genotype-meets-phenotype informatics. One part of this system is the identification of relationships between clinically important entities in the text. Typical approaches to relationship extraction in this domain have used full parses, domain-specific grammars, and large knowledge bases encoding domain knowledge. In other areas of biomedical NLP, statistical machine learning (ML) approaches are now routinely applied to relationship extraction. We report on the novel application of these statistical techniques to the extraction of clinical relationships. Results: We have designed and implemented an ML-based system for relation extraction, using support vector machines, and trained and tested it on a corpus of oncology narratives hand-annotated with clinically important relationships. Over a class of seven relation types, the system achieves an average F1 score of 72%, only slightly behind an indicative measure of human inter-annotator agreement on the same task. We investigate the effectiveness of different features for this task, how extraction performance varies between inter- and intra-sentential relationships, and examine the amount of training data needed to learn various relationships. Conclusion: We have shown that it is possible to extract important clinical relationships from text, using supervised statistical ML techniques, at levels of accuracy approaching those of human annotators. Given the importance of relation extraction as an enabling technology for text mining and given also the ready adaptability of systems based on our supervised learning approach to other clinical relationship extraction tasks, this result has significance for clinical text mining more generally, though further work to confirm our encouraging results should be carried out on a larger sample of narratives and relationship types.

    Learning Ensembles of First Order Clauses That Optimize Precision Recall Curves

    Many domains in the field of Inductive Logic Programming (ILP) involve highly unbalanced data, such as biomedical information extraction, citation matching, and learning relationships in social networks. A common way to measure performance in these domains is to use precision and recall instead of simply using accuracy, and to examine their trade-offs by plotting a precision-recall curve. The goal of this thesis is to find new approaches within ILP particularly suited for large, highly skewed domains. I propose and investigate Gleaner, a randomized search method that collects good clauses from a broad spectrum of points along the recall dimension in recall-precision curves and employs thresholding methods to combine sets of selected clauses. I compare Gleaner to ensembles of standard theories learned by Aleph, a standard ILP algorithm, using a number of large relational domains. I find that Gleaner produces comparable testset results in a fraction of the training time and outperforms Aleph ensembles when given the same amount of training time. I explore extensions to Gleaner with respect to searching and combining clauses, namely finding ways to fully explore the hypothesis space as well as to make better use of those found clauses. I also use Gleaner to estimate the probability that a query is true, further investigate the properties underlying precision-recall curves, and then conclude with a discussion of future work in this area

    Combining Clauses with Various Precisions and Recalls to Produce Accurate Probabilistic Estimates

    Statistical Relational Learning (SRL) combines the benefits of probabilistic machine learning approaches with complex, structured domains from Inductive Logic Programming (ILP). We propose a new SRL algorithm, GleanerSRL, to generate probabilities for highly-skewed relational domains. In this work, we combine clauses from Gleaner, an ILP algorithm for learning a wide variety of first-order clauses, with the propositional learning technique of support vector machines to learn well-calibrated probabilities. We find that our results are comparable to SRL algorithms SAYU and SAYU-VISTA on a well-known relational testbed.

    Portraits of persistent pain: a portfolio of work relating to the 'problem of pain'

    Chronic pain is a mysterious and challenging problem that affects a significant number of people in a significant number of ways. Perhaps one of the greatest challenges of these conditions is the highly subjective nature of pain, coupled with the apparent absence of any observable abnormality. In this study Interpretative Phenomenological Analysis was adopted in order to gain access to elements of these subjective experiences and, in response to the ‘invisibility’ of pain, a creative approach was also incorporated into its design. Seven working age female participants were recruited and invited to share aspects of their pain experience through both narrative accounts and pictorial representations. Participants’ images and their accounts of them provided a rich gestalt which communicates a range of difficulties in a single cohesive image, which in turn also served to complement the other themes identified in the study. Participants unanimously found this feature of the study facilitative as well as cathartic, and it is suggested that these positive experiences may also hold a significant clinical value. The current study suggests that, by adopting multimodal methods as a means of exploring lived experience, a potential opportunity has arisen which could help to bridge the ‘gap’ between what is ‘seen’ and what is ‘felt.’ It is suggested that in the development of ever more creative means of approaching the ‘problem of pain,’ art and art therapy may be considered for their potential in helping patients to reveal aspects of their difficulties in order to be both better understood and supported.

    Seven Semesters of Android Game Programming in CS2

    Mobile game development is a topic that interests many computer science students. The author included an open-ended Android game development project as the final project in a CS2 course for four years. Over 7 semesters, 141 students produced 87 different mobile games, of which 29 were published on the Google Play store. This paper discusses the experience of how this Android project was integrated with the course, as well as the projects themselves, and examines the factors that led to successful student submissions

    Gleaner: Creating Ensembles of First-Order Clauses to Improve Recall-Precision Curves

    Many domains in the field of Inductive Logic Programming (ILP) involve highly unbalanced data. A common way to measure performance in these domains is to use precision and recall instead of simply using accuracy. The goal of our research is to find new approaches within ILP particularly suited for large, highly-skewed domains. We propose Gleaner, a randomized search method that collects good clauses from a broad spectrum of points along the recall dimension in recall-precision curves and employs an “at least L of these K clauses” thresholding method to combine sets of selected clauses. Our research focuses on Multi-Slot Information Extraction (IE), a task that typically involves many more negative examples than positive examples. We formulate this problem into a relational domain, using two large testbeds involving the extraction of important relations from the abstracts of biomedical journal articles. We compare Gleaner to ensembles of standard theories learned by Aleph, finding that Gleaner produces comparable testset results in a fraction of the training time.
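
The “at least L of these K clauses” thresholding idea mentioned in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the clause representation (plain Python predicates over a dict), the function names `at_least_l_of_k` and `precision_recall`, and the toy data are all assumptions made for illustration. It only shows how raising L trades recall for precision.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for a learned first-order clause:
# a predicate that returns True when the clause covers an example.
Example = Dict[str, bool]
Clause = Callable[[Example], bool]


def at_least_l_of_k(clauses: Sequence[Clause], l: int) -> Clause:
    """Build a classifier that fires when at least `l` of the clauses match."""
    def classify(example: Example) -> bool:
        matches = sum(1 for clause in clauses if clause(example))
        return matches >= l
    return classify


def precision_recall(
    predict: Clause, examples: List[Example], labels: List[bool]
) -> Tuple[float, float]:
    """Compute (precision, recall) of `predict` on labelled examples."""
    tp = sum(1 for x, y in zip(examples, labels) if predict(x) and y)
    fp = sum(1 for x, y in zip(examples, labels) if predict(x) and not y)
    fn = sum(1 for x, y in zip(examples, labels) if not predict(x) and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (invented for illustration): three clauses, five examples.
clauses: List[Clause] = [
    lambda e: e["a"],
    lambda e: e["b"],
    lambda e: e["c"],
]
examples: List[Example] = [
    {"a": True, "b": True, "c": True},
    {"a": True, "b": True, "c": False},
    {"a": True, "b": False, "c": False},
    {"a": False, "b": False, "c": False},
    {"a": False, "b": True, "c": False},
]
labels = [True, True, False, False, False]

# Sweeping L from 1 to K yields one (precision, recall) point per threshold.
curve = [
    precision_recall(at_least_l_of_k(clauses, l), examples, labels)
    for l in range(1, len(clauses) + 1)
]
```

On this toy data, a low threshold accepts every positive but also some negatives, while the highest threshold rejects all negatives at the cost of missing a positive; the sweep over L is what produces distinct points along a recall-precision curve.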